Space and Time Efficient Implementations of a Parallel Direct Solver using Nested Dissection
Authors
Abstract
This paper is concerned with algorithms for the efficient parallel solution of sparse, symmetric n × n linear systems via direct factorization, eliminating groups of variables in stages. The algorithms use nested dissection to recursively cut the data into pieces, producing orderings for direct elimination of the variables so as to reduce the storage and arithmetic requirements. Parallel Direct Solve (PDS) algorithms, which perform this variable elimination in parallel using nested dissection orderings, have been developed for various parallel machine models, including the parallel random access machine (PRAM) model and grid architectures. In this paper, we describe the following improvements to the known PDS algorithms:
1. FAST PDS: a fast PDS algorithm for the PRAM model, which reduces the parallel time bound by a factor of O(log n) without significantly increasing the processor bound.
2. COMPACT PDS: a PDS algorithm for a mesh-connected processor array, applicable to matrices representable by a grid graph, that uses O(n) processors, takes O(√n) parallel time, and reduces the space bounds to linear without a significant increase in time or processor bounds.
3. GENERALIZED COMPACT PDS: a PDS algorithm for a mesh-connected processor array that uses O(n) processors, takes O(√n) parallel time, and reduces the space bounds to linear, for the more general case of matrices representable by a graph of constant degree and separator size √n (as required for many 2D PDE applications).
∗ A preliminary version of this paper appeared as Deganit Armon and John H. Reif, Space and time efficient implementations of parallel nested dissection, 4th Annual ACM Symposium on Parallel Algorithms and Architectures, San Diego, CA, July 1992.
† Afeka Tel Aviv Academic College of Engineering.
‡ Department of Computer Science, Duke University, Durham, NC 27708, USA and Adjunct, Faculty of Computing and Information Technology (FCIT), King Abdulaziz University (KAU), Jeddah, Saudi Arabia. Email: [email protected].
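To make the idea of a nested dissection ordering concrete, the following is a minimal sequential sketch for a grid graph, not the parallel algorithms of the paper. It assumes vertices are grid points (row, column), uses a single middle row or column as each separator, and the function name `nested_dissection_order` is hypothetical. Each subdomain is ordered first and its separator last, which is what limits fill-in during elimination.

```python
def nested_dissection_order(rows, cols):
    """Return the vertices (r, c) of a rows x cols grid graph in an
    elimination order: both subdomains first, their separator last."""

    def recurse(r0, r1, c0, c1):
        # Work on the half-open block [r0, r1) x [c0, c1).
        h, w = r1 - r0, c1 - c0
        if h <= 0 or w <= 0:
            return []
        if h <= 2 and w <= 2:
            # Small blocks are ordered directly (an illustrative cutoff).
            return [(r, c) for r in range(r0, r1) for c in range(c0, c1)]
        if w >= h:
            # Cut along the middle column of the wider dimension.
            mid = c0 + w // 2
            first = recurse(r0, r1, c0, mid)
            second = recurse(r0, r1, mid + 1, c1)
            separator = [(r, mid) for r in range(r0, r1)]
        else:
            # Cut along the middle row.
            mid = r0 + h // 2
            first = recurse(r0, mid, c0, c1)
            second = recurse(mid + 1, r1, c0, c1)
            separator = [(mid, c) for c in range(c0, c1)]
        # Separator vertices are eliminated after both halves.
        return first + second + separator

    return recurse(0, rows, 0, cols)


order = nested_dissection_order(7, 7)
```

For a √n × √n grid the top-level separator has √n vertices, which matches the separator size assumed in the abstract; the two halves touch only through the separator, so they can be eliminated independently.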
Similar resources
Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields
This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF(2^m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...
Performance of a Fully Parallel Sparse Solver
The performance of a fully parallel direct solver for large sparse symmetric positive definite systems of linear equations is demonstrated. The solver is designed for distributed-memory, message-passing parallel computer systems. All phases of the computation, including symbolic processing as well as numeric factorization and triangular solution, are performed in parallel. A parallel Cartesian n...
Anisotropic ‘Helmholtz’ Equations: Massively Parallel Structured Multifrontal Solver Using Nested Dissection Based Domain Decomposition with Separators of Variable Thickness
Abstract. We consider the discretization and approximate solution of inhomogeneous anisotropic ‘Helmholtz’ equations in 3D. The anisotropy comprises general (tilted) TI symmetries. In particular, we are concerned with solving these equations on a large domain, for a large number of different sources. We make use of a nested dissection based domain decomposition in a massively parallel multifron...
Parallel Ordering Using Edge Contraction
Computing a fill-reducing ordering of a sparse matrix is a central problem in the solution of sparse linear systems using direct methods. In recent years, there has been significant research in developing a sparse direct solver suitable for message-passing multiprocessors. However, computing the ordering step in parallel remains a challenge and there are very few methods available. This paper desc...
Simulation of Earthquake Liquefaction Response on Parallel Computers
This paper presents a parallel nonlinear finite element program, ParCYCLIC, which is designed for the analysis of cyclic seismically-induced liquefaction problems. Key elements of the computational strategy employed in ParCYCLIC include the deployment of an automatic domain decomposer, the use of the multilevel nested dissection algorithm for the ordering of finite element nodes, and the develo...